Robust input representations for low-resource information extraction
Recent advances in the field of natural language processing have been achieved with deep learning models. This led to a wide range of new research questions concerning the stability of such large-scale systems and their applicability beyond well-studied tasks and datasets, such as information extraction in non-standard domains and languages, in particular, in low-resource environments. In this work, we address these challenges and make important contributions across fields such as representation learning and transfer learning by proposing novel model architectures and training strategies to overcome existing limitations, including a lack of training resources, domain mismatches and language barriers. In particular, we propose solutions to close the domain gap between representation models by, e.g., domain-adaptive pre-training or our novel meta-embedding architecture for creating a joint representation of multiple embedding methods. Our broad set of experiments demonstrates state-of-the-art performance of our methods for various sequence tagging and classification tasks and highlights their robustness in challenging low-resource settings across languages and domains.
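To make the meta-embedding idea concrete, here is a minimal, hypothetical PyTorch sketch of an attention-based meta-embedding layer that projects several embedding methods into a shared space and learns per-token weights for combining them. All names and dimensions (AttentionMetaEmbedding, shared_dim, etc.) are illustrative assumptions, not the thesis's actual implementation.

```python
import torch
import torch.nn as nn

class AttentionMetaEmbedding(nn.Module):
    """Hypothetical sketch: combine several embedding methods into one
    joint representation via per-token attention over the methods."""

    def __init__(self, input_dims, shared_dim=256):
        super().__init__()
        # one projection per embedding method, into a shared space
        self.projections = nn.ModuleList(
            [nn.Linear(d, shared_dim) for d in input_dims]
        )
        self.attention = nn.Linear(shared_dim, 1)

    def forward(self, embeddings):
        # embeddings: list of tensors, each (batch, seq_len, input_dims[i])
        projected = torch.stack(
            [proj(e) for proj, e in zip(self.projections, embeddings)], dim=2
        )  # (batch, seq_len, num_methods, shared_dim)
        scores = self.attention(projected)       # (batch, seq_len, num_methods, 1)
        weights = torch.softmax(scores, dim=2)   # normalize over methods
        return (weights * projected).sum(dim=2)  # (batch, seq_len, shared_dim)
```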
NLNDE: Enhancing Neural Sequence Taggers with Attention and Noisy Channel for Robust Pharmacological Entity Detection
Named entity recognition has been extensively studied on English news texts.
However, the transfer to other domains and languages is still a challenging
problem. In this paper, we describe the system with which we participated in
the first subtrack of the PharmaCoNER competition of the BioNLP Open Shared
Tasks 2019. Aiming at pharmacological entity detection in Spanish texts, the
task provides a non-standard domain and language setting. However, we propose
an architecture that requires neither language nor domain expertise. We treat
the task as sequence labeling and experiment with attention-based embedding
selection and with training on automatically annotated data to further
improve our system's performance. Our system achieves promising results,
especially by combining the different techniques, and reaches up to 88.6% F1 in
the competition.

Comment: Published at BioNLP-OST@EMNLP 2019
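As an illustration of the sequence-labeling setup described above, the following hypothetical sketch shows a BiLSTM tagging head over pre-computed token embeddings (e.g. the output of a meta-embedding layer as sketched earlier). It is a generic sketch under assumed names and dimensions, not the NLNDE system itself; a CRF output layer, common in such taggers, is omitted for brevity.

```python
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    """Hypothetical BiLSTM sequence tagger over pre-computed embeddings."""

    def __init__(self, embed_dim, hidden_dim, num_labels):
        super().__init__()
        self.lstm = nn.LSTM(embed_dim, hidden_dim,
                            batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden_dim, num_labels)

    def forward(self, token_embeddings):
        # token_embeddings: (batch, seq_len, embed_dim), e.g. produced by
        # an attention-based embedding-selection layer
        hidden, _ = self.lstm(token_embeddings)
        return self.classifier(hidden)  # per-token label logits
```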
Feature-Dependent Confusion Matrices for Low-Resource NER Labeling with Noisy Labels
In low-resource settings, the performance of supervised labeling models can
be improved with automatically annotated or distantly supervised data, which is
cheap to create but often noisy. Previous works have shown that significant
improvements can be reached by injecting information about the confusion
between clean and noisy labels in this additional training data into the
classifier training. However, for noise estimation, these approaches either do
not take the input features (in our case word embeddings) into account, or they
need to learn the noise modeling from scratch which can be difficult in a
low-resource setting. We propose to cluster the training data using the input
features and then compute different confusion matrices for each cluster. To the
best of our knowledge, our approach is the first to leverage feature-dependent
noise modeling with pre-initialized confusion matrices. We evaluate on
low-resource named entity recognition settings in several languages, showing
that our methods improve upon other confusion-matrix-based methods by up to 9%.

Comment: Published at EMNLP-IJCNLP 2019
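The clustering idea can be sketched in a few lines. The hypothetical Python snippet below groups tokens by their embeddings with k-means and counts clean-vs-noisy label pairs within each cluster to pre-initialize one confusion matrix per cluster; the function and parameter names are illustrative assumptions, not the paper's code.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_confusion_matrices(embeddings, clean_labels, noisy_labels,
                               num_clusters=10, num_classes=5):
    """Estimate one confusion matrix per embedding cluster (sketch)."""
    # group tokens by their input features (here: word embeddings)
    clusters = KMeans(n_clusters=num_clusters).fit_predict(embeddings)
    matrices = np.zeros((num_clusters, num_classes, num_classes))
    # count how often a noisy label replaces a clean label per cluster
    for c, y_clean, y_noisy in zip(clusters, clean_labels, noisy_labels):
        matrices[c, y_clean, y_noisy] += 1
    # row-normalize to P(noisy label | clean label), guarding empty rows
    row_sums = matrices.sum(axis=2, keepdims=True)
    return np.divide(matrices, row_sums,
                     out=np.zeros_like(matrices), where=row_sums > 0)
```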
SwitchPrompt: Learning Domain-Specific Gated Soft Prompts for Classification in Low-Resource Domains
Prompting pre-trained language models leads to promising results across
natural language processing tasks but is less effective when applied in
low-resource domains, due to the domain gap between the pre-training data and
the downstream task. In this work, we bridge this gap with a novel and
lightweight prompting methodology called SwitchPrompt for the adaptation of
language models trained on datasets from the general domain to diverse
low-resource domains. Using domain-specific keywords with a trainable gated
prompt, SwitchPrompt offers domain-oriented prompting, that is, effective
guidance on the target domains for general-domain language models. Our few-shot
experiments on three text classification benchmarks demonstrate the efficacy of
the general-domain pre-trained language models when used with SwitchPrompt.
They often even outperform their domain-specific counterparts trained with
state-of-the-art baseline prompting methods, with accuracy gains of up to
10.7%. This result indicates that SwitchPrompt effectively reduces the need
for domain-specific language model pre-training.

Comment: Accepted at EACL 2023 Main Conference
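The gating mechanism can be illustrated with a short sketch. The hypothetical PyTorch module below interpolates between a general-domain and a domain-specific soft prompt with a trainable gate and prepends the result to the input embeddings. The gating form and all names are assumptions for illustration, not SwitchPrompt's exact formulation; in particular, the paper's keyword-based initialization of domain prompts is only noted in a comment here.

```python
import torch
import torch.nn as nn

class GatedSoftPrompt(nn.Module):
    """Hypothetical gated soft prompt in the spirit of SwitchPrompt."""

    def __init__(self, prompt_len, embed_dim):
        super().__init__()
        self.general = nn.Parameter(torch.randn(prompt_len, embed_dim))
        # the domain prompt could instead be initialized from embeddings
        # of domain-specific keywords
        self.domain = nn.Parameter(torch.randn(prompt_len, embed_dim))
        self.gate = nn.Parameter(torch.zeros(prompt_len, 1))

    def forward(self, input_embeddings):
        # input_embeddings: (batch, seq_len, embed_dim) from a frozen LM
        g = torch.sigmoid(self.gate)  # per-position gate in (0, 1)
        prompt = g * self.domain + (1 - g) * self.general
        prompt = prompt.unsqueeze(0).expand(input_embeddings.size(0), -1, -1)
        # prepend the mixed prompt to the input sequence
        return torch.cat([prompt, input_embeddings], dim=1)
```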